Evaluation of Jackknife and Bootstrap for Defining Confidence Intervals for Pairwise Agreement Measures

Authors

  • Ana Severiano
  • João A. Carriço
  • D. Ashley Robinson
  • Mário Ramirez
  • Francisco R. Pinto
Abstract

Several research fields frequently deal with the analysis of diverse classification results of the same entities. This requires an objective detection of overlaps and divergences between the formed clusters. The congruence between classifications can be quantified by clustering agreement measures, including pairwise agreement measures. Several measures have been proposed, and the importance of obtaining confidence intervals for the point estimate when comparing these measures has been highlighted. A broad range of methods can be used to estimate confidence intervals. However, evidence is lacking about which methods are appropriate for calculating confidence intervals for most clustering agreement measures. Here we evaluate the bootstrap and jackknife resampling techniques for calculating confidence intervals for clustering agreement measures. Contrary to what has been shown for some statistics, simulations showed that the jackknife performs better than the bootstrap at accurately estimating confidence intervals for pairwise agreement measures, especially when the agreement between partitions is low. The coverage of the jackknife confidence interval is robust to changes in cluster number and cluster size distribution.
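As a rough illustration of the two resampling schemes compared in the abstract, the sketch below computes a confidence interval for one pairwise agreement measure between two partitions of the same entities. Several choices here are assumptions made only for illustration and are not taken from the paper: the Rand index as the agreement measure, a pseudo-value jackknife with a normal approximation, a percentile bootstrap, and all function names and parameters (`rand_index`, `jackknife_ci`, `bootstrap_ci`, `n_boot`, `alpha`, `seed`).

```python
import random
import statistics
from itertools import combinations


def rand_index(a, b):
    """Fraction of entity pairs on which partitions a and b agree
    (pair is together in both, or apart in both)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def jackknife_ci(a, b, stat=rand_index, z=1.96):
    """Pseudo-value jackknife interval (normal approximation, assumed variant)."""
    a, b = list(a), list(b)
    n = len(a)
    theta = stat(a, b)
    pseudo = []
    for i in range(n):
        # Recompute the statistic with entity i left out of both partitions.
        loo = stat(a[:i] + a[i + 1:], b[:i] + b[i + 1:])
        pseudo.append(n * theta - (n - 1) * loo)
    est = statistics.mean(pseudo)
    se = statistics.stdev(pseudo) / n ** 0.5
    return est - z * se, est + z * se


def bootstrap_ci(a, b, stat=rand_index, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling entities with replacement."""
    a, b = list(a), list(b)
    rng = random.Random(seed)
    n = len(a)
    reps = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        reps.append(stat([a[i] for i in idx], [b[i] for i in idx]))
    reps.sort()
    # Simple order-statistic percentiles of the bootstrap replicates.
    lo = reps[int((alpha / 2) * (n_boot - 1))]
    hi = reps[int(round((1 - alpha / 2) * (n_boot - 1)))]
    return lo, hi


if __name__ == "__main__":
    # Two hypothetical classifications of the same ten entities.
    part_a = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
    part_b = [0, 0, 1, 1, 1, 2, 2, 2, 0, 2]
    print("Rand index:", rand_index(part_a, part_b))
    print("Jackknife CI:", jackknife_ci(part_a, part_b))
    print("Bootstrap CI:", bootstrap_ci(part_a, part_b))
```

Both functions take the agreement measure as the `stat` argument, so another pairwise measure could be substituted without changing the resampling logic. The jackknife interval is not clipped to the measure's valid range, so it can extend beyond [0, 1] for small samples.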

Similar resources

Bootstrap and jackknife resampling methods in the survival analysis of patients with thalassemia major

Background and Objectives: A small sample size can influence the results of statistical analysis. The sample size may be reduced for different reasons, such as loss of information, i.e. missing values in some variables. This study aimed to apply bootstrap and jackknife resampling methods in survival analysis of thalassemia major patients. Methods: In this historical coh...

Bootstrap confidence intervals of CNpk for type‑II generalized log‑logistic distribution

This paper deals with construction of confidence intervals for process capability index using bootstrap method (proposed by Chen and Pearn in Qual Reliab Eng Int 13(6):355–360, 1997) by applying simulation technique. It is assumed that the quality characteristic follows type-II generalized log-logistic distribution introduced by Rosaiah et al. in Int J Agric Stat Sci 4(2):283–292, (2008). Discu...

Minimal Values for Reliability of Bootstrap and Jackknife Proportions, Decay Index, and Bayesian Posterior Probability

Although optimal cladograms based on real data sets are readily demonstrated to be well loaded with phylogenetic data, statistical means of evaluating dependability of details of branch arrangements have been problematic. Exact values of four measures of branch arrangement reliability (nonparametric bootstrap and jackknife proportions, the Decay Index, and Bayesian posterior probabilities) were o...

Confidence Intervals for Random Forests: The Jackknife and the Infinitesimal Jackknife

We study the variability of predictions made by bagged learners and random forests, and show how to estimate standard errors for these methods. Our work builds on variance estimates for bagging proposed by Efron (1992, 2012) that are based on the jackknife and the infinitesimal jackknife (IJ). In practice, bagged predictors are computed using a finite number B of bootstrap replicates, and worki...

Jackknife and Bootstrap Methods for Variance Estimation from Sample Survey Data

Re-sampling methods have long been used in survey sampling, dating back to Mahalanobis (1946). More recently, jackknife and bootstrap resampling methods have also been proposed for small area estimation; in particular for mean squared error (MSE) estimation and for constructing confidence intervals. We present a brief overview of early uses of resampling methods in survey sampling, and then pro...

Journal title:

Volume 6, Issue

Pages -

Publication date: 2011